The webuse command can load the directly.
. webuse margex
(Artificial data for margins)
. summarize, sep(0)
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
y | 3,000 69.73357 21.53986 0 146.3
outcome | 3,000 .1696667 .3754023 0 1
sex | 3,000 .5006667 .5000829 0 1
group | 3,000 1.828 .7732714 1 3
age | 3,000 39.799 11.54174 20 60
distance | 3,000 58.58566 181.3987 .25 748.8927
ycn | 3,000 74.47263 22.01264 0 153.8
yc | 3,000 .2413333 .4279633 0 1
treatment | 3,000 .5006667 .5000829 0 1
agegroup | 3,000 2.263667 .8316913 1 3
arm | 3,000 1.970667 .7899457 1 3
We’ll load the data from Stata’s website via the haven package. We’ll also convert to standard R data.frame and factors.
library(haven)
m <- haven::read_dta("http://www.stata-press.com/data/r15/margex.dta")
m <- data.frame(m)
m$sex <- as_factor(m$sex)
m$group <- as_factor(m$group)
head(m) y outcome sex group age distance ycn yc treatment agegroup arm
1 102.6 1 female 1 55 11.75 97.3 1 1 3 1
2 77.2 0 female 1 60 30.75 78.6 0 1 3 1
3 80.5 0 female 1 27 14.25 54.5 0 1 1 1
4 82.5 1 female 1 60 29.25 98.4 1 1 3 1
5 71.6 1 female 1 52 19.00 105.3 1 1 3 1
6 83.2 1 female 1 49 2.50 89.7 0 1 3 1
. regress y i.group
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(2, 2997) = 15.48
Model | 14229.0349 2 7114.51743 Prob > F = 0.0000
Residual | 1377203.97 2,997 459.527518 R-squared = 0.0102
-------------+---------------------------------- Adj R-squared = 0.0096
Total | 1391433.01 2,999 463.965657 Root MSE = 21.437
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
group |
2 | .4084991 .8912269 0.46 0.647 -1.338979 2.155977
3 | 5.373138 1.027651 5.23 0.000 3.358165 7.38811
|
_cons | 68.35805 .6190791 110.42 0.000 67.14419 69.57191
------------------------------------------------------------------------------
. margins group
Adjusted predictions Number of obs = 3,000
Model VCE : OLS
Expression : Linear prediction, predict()
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
group |
1 | 68.35805 .6190791 110.42 0.000 67.14419 69.57191
2 | 68.76655 .6411134 107.26 0.000 67.50948 70.02361
3 | 73.73119 .8202484 89.89 0.000 72.12288 75.33949
------------------------------------------------------------------------------
Call:
lm(formula = y ~ group, data = m)
Residuals:
Min 1Q Median 3Q Max
-73.531 -14.758 0.101 14.536 72.569
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 68.3580 0.6191 110.419 < 2e-16 ***
group2 0.4085 0.8912 0.458 0.647
group3 5.3731 1.0277 5.229 1.83e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.44 on 2997 degrees of freedom
Multiple R-squared: 0.01023, Adjusted R-squared: 0.009566
F-statistic: 15.48 on 2 and 2997 DF, p-value: 2.045e-07
group emmean SE df lower.CL upper.CL
1 68.4 0.619 2997 67.1 69.6
2 68.8 0.641 2997 67.5 70.0
3 73.7 0.820 2997 72.1 75.3
Confidence level used: 0.95
Assume a linear regression set up with a single categorical variable, \(G\), with three groups. We fit
\[ E(Y|G) = \beta_0 + \beta_1g_2 + \beta_2g_3, \]
where \(g_2\) and \(g_3\) are dummy variables representing membership in groups 2 and 3 respectively (\(g_{2i} = 1\) if observation \(i\) has \(G = 2\), and equal to 0 otherwise.) Since \(g_1\) is not found in the model, it is the reference category.
Therefore, \(\beta_0\) represents the average response among the reference category \(G = 1\). \(\beta_1\) represents the difference in the average response between groups \(G = 1\) and \(G = 2\). Therefore, \(\beta_0 + \beta_1\) is the average response in group \(G = 2\). A similar argument can be made about \(\beta_2\) and group 3.
. regress y i.sex##i.group
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(5, 2994) = 92.91
Model | 186898.199 5 37379.6398 Prob > F = 0.0000
Residual | 1204534.81 2,994 402.316235 R-squared = 0.1343
-------------+---------------------------------- Adj R-squared = 0.1329
Total | 1391433.01 2,999 463.965657 Root MSE = 20.058
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
female | 21.62507 1.509999 14.32 0.000 18.66433 24.58581
|
group |
2 | 11.37444 1.573314 7.23 0.000 8.289552 14.45932
3 | 21.6188 1.588487 13.61 0.000 18.50416 24.73344
|
sex#group |
female#2 | -4.851582 1.942744 -2.50 0.013 -8.66083 -1.042333
female#3 | -6.084875 3.004638 -2.03 0.043 -11.97624 -.1935115
|
_cons | 50.6107 1.367932 37.00 0.000 47.92852 53.29288
------------------------------------------------------------------------------
. margins sex
Predictive margins Number of obs = 3,000
Model VCE : OLS
Expression : Linear prediction, predict()
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
male | 59.77145 .6454388 92.61 0.000 58.5059 61.037
female | 78.20318 .7105463 110.06 0.000 76.80997 79.59639
------------------------------------------------------------------------------
. margins sex#group
Adjusted predictions Number of obs = 3,000
Model VCE : OLS
Expression : Linear prediction, predict()
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex#group |
male#1 | 50.6107 1.367932 37.00 0.000 47.92852 53.29288
male#2 | 61.98514 .7772248 79.75 0.000 60.46119 63.50908
male#3 | 72.2295 .8074975 89.45 0.000 70.64619 73.8128
female#1 | 72.23577 .63942 112.97 0.000 70.98203 73.48952
female#2 | 78.75863 .9434406 83.48 0.000 76.90877 80.60849
female#3 | 87.7697 2.468947 35.55 0.000 82.92869 92.6107
------------------------------------------------------------------------------
Call:
lm(formula = y ~ sex * group, data = m)
Residuals:
Min 1Q Median 3Q Max
-72.029 -13.285 0.464 13.741 67.964
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 50.611 1.368 36.998 < 2e-16 ***
sexfemale 21.625 1.510 14.321 < 2e-16 ***
group2 11.374 1.573 7.230 6.12e-13 ***
group3 21.619 1.588 13.610 < 2e-16 ***
sexfemale:group2 -4.852 1.943 -2.497 0.0126 *
sexfemale:group3 -6.085 3.005 -2.025 0.0429 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.06 on 2994 degrees of freedom
Multiple R-squared: 0.1343, Adjusted R-squared: 0.1329
F-statistic: 92.91 on 5 and 2994 DF, p-value: < 2.2e-16
group emmean SE df lower.CL upper.CL
1 61.4 0.755 2994 59.9 62.9
2 70.4 0.611 2994 69.2 71.6
3 80.0 1.299 2994 77.5 82.5
Results are averaged over the levels of: sex
Confidence level used: 0.95
group sex emmean SE df lower.CL upper.CL
1 male 50.6 1.368 2994 47.9 53.3
2 male 62.0 0.777 2994 60.5 63.5
3 male 72.2 0.807 2994 70.6 73.8
1 female 72.2 0.639 2994 71.0 73.5
2 female 78.8 0.943 2994 76.9 80.6
3 female 87.8 2.469 2994 82.9 92.6
Confidence level used: 0.95
Consider the simplest case, with two binary variables \(Z\) and \(K\). We fit the model
\[ E(Y|Z,K) = \beta_0 + \beta_1Z + \beta_2K + \beta_3ZK, \]
where \(ZK = 0\) only if both \(Z = 1\) and \(K = 1\).
This is functionally equivalent to defining a new variable \(L\),
\[ L = \begin{cases} 0, & K = 0 \& Z = 0 \\ 1, & K = 0 \& Z = 1 \\ 2, & K = 1 \& Z = 0 \\ 3, & K = 1 \& Z = 1, \end{cases} \]
and fitting the model
\[ E(Y|L) = \beta_0 + \beta_1l_1 + \beta_2l_2 + \beta_3l_3. \]
. regress y i.sex c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(2, 2997) = 209.88
Model | 170938.657 2 85469.3283 Prob > F = 0.0000
Residual | 1220494.35 2,997 407.238688 R-squared = 0.1229
-------------+---------------------------------- Adj R-squared = 0.1223
Total | 1391433.01 2,999 463.965657 Root MSE = 20.18
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
female | 14.03271 .7777377 18.04 0.000 12.50775 15.55766
age | -.5043666 .033698 -14.97 0.000 -.5704402 -.4382931
_cons | 82.78115 1.323613 62.54 0.000 80.18586 85.37643
------------------------------------------------------------------------------
. margins sex, at(age = (30 40))
Adjusted predictions Number of obs = 3,000
Model VCE : OLS
Expression : Linear prediction, predict()
1._at : age = 30
2._at : age = 40
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_at#sex |
1#male | 67.65015 .5604889 120.70 0.000 66.55116 68.74913
1#female | 81.68285 .6911128 118.19 0.000 80.32775 83.03796
2#male | 62.60648 .537682 116.44 0.000 61.55222 63.66074
2#female | 76.63919 .533784 143.58 0.000 75.59257 77.68581
------------------------------------------------------------------------------
For each unique level of the continuous variable, a seperate emmeans call is needed apparently. If anyone knows a way around this, please let me know.
Call:
lm(formula = y ~ sex + age, data = m)
Residuals:
Min 1Q Median 3Q Max
-71.99 -13.88 -0.15 13.76 75.28
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 82.7811 1.3236 62.54 <2e-16 ***
sexfemale 14.0327 0.7777 18.04 <2e-16 ***
age -0.5044 0.0337 -14.97 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.18 on 2997 degrees of freedom
Multiple R-squared: 0.1229, Adjusted R-squared: 0.1223
F-statistic: 209.9 on 2 and 2997 DF, p-value: < 2.2e-16
sex emmean SE df lower.CL upper.CL
male 67.7 0.560 2997 66.6 68.7
female 81.7 0.691 2997 80.3 83.0
Confidence level used: 0.95
sex emmean SE df lower.CL upper.CL
male 62.6 0.538 2997 61.6 63.7
female 76.6 0.534 2997 75.6 77.7
Confidence level used: 0.95
. regress y i.sex##c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(3, 2996) = 139.91
Model | 170983.675 3 56994.5583 Prob > F = 0.0000
Residual | 1220449.33 2,996 407.35959 R-squared = 0.1229
-------------+---------------------------------- Adj R-squared = 0.1220
Total | 1391433.01 2,999 463.965657 Root MSE = 20.183
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
female | 14.92308 2.789012 5.35 0.000 9.454508 20.39165
age | -.4929608 .0480944 -10.25 0.000 -.5872622 -.3986595
|
sex#c.age |
female | -.0224116 .0674167 -0.33 0.740 -.1545994 .1097762
|
_cons | 82.36936 1.812958 45.43 0.000 78.8146 85.92413
------------------------------------------------------------------------------
. margins sex, at(age = (30 40))
Adjusted predictions Number of obs = 3,000
Model VCE : OLS
Expression : Linear prediction, predict()
1._at : age = 30
2._at : age = 40
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_at#sex |
1#male | 67.58054 .5984013 112.94 0.000 66.40722 68.75386
1#female | 81.83127 .8228617 99.45 0.000 80.21784 83.4447
2#male | 62.65093 .5541361 113.06 0.000 61.56441 63.73746
2#female | 76.67755 .5461908 140.39 0.000 75.6066 77.74849
------------------------------------------------------------------------------
Call:
lm(formula = y ~ sex * age, data = m)
Residuals:
Min 1Q Median 3Q Max
-71.817 -13.848 -0.116 13.758 75.263
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 82.36936 1.81296 45.434 < 2e-16 ***
sexfemale 14.92308 2.78901 5.351 9.42e-08 ***
age -0.49296 0.04809 -10.250 < 2e-16 ***
sexfemale:age -0.02241 0.06742 -0.332 0.74
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.18 on 2996 degrees of freedom
Multiple R-squared: 0.1229, Adjusted R-squared: 0.122
F-statistic: 139.9 on 3 and 2996 DF, p-value: < 2.2e-16
sex emmean SE df lower.CL upper.CL
male 67.6 0.598 2996 66.4 68.8
female 81.8 0.823 2996 80.2 83.4
Confidence level used: 0.95
sex emmean SE df lower.CL upper.CL
male 62.7 0.554 2996 61.6 63.7
female 76.7 0.546 2996 75.6 77.7
Confidence level used: 0.95
. regress y i.sex c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(2, 2997) = 209.88
Model | 170938.657 2 85469.3283 Prob > F = 0.0000
Residual | 1220494.35 2,997 407.238688 R-squared = 0.1229
-------------+---------------------------------- Adj R-squared = 0.1223
Total | 1391433.01 2,999 463.965657 Root MSE = 20.18
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
female | 14.03271 .7777377 18.04 0.000 12.50775 15.55766
age | -.5043666 .033698 -14.97 0.000 -.5704402 -.4382931
_cons | 82.78115 1.323613 62.54 0.000 80.18586 85.37643
------------------------------------------------------------------------------
. margins, dydx(age)
Average marginal effects Number of obs = 3,000
Model VCE : OLS
Expression : Linear prediction, predict()
dy/dx w.r.t. : age
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | -.5043666 .033698 -14.97 0.000 -.5704402 -.4382931
------------------------------------------------------------------------------
Call:
lm(formula = y ~ sex + age, data = m)
Residuals:
Min 1Q Median 3Q Max
-71.99 -13.88 -0.15 13.76 75.28
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 82.7811 1.3236 62.54 <2e-16 ***
sexfemale 14.0327 0.7777 18.04 <2e-16 ***
age -0.5044 0.0337 -14.97 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.18 on 2997 degrees of freedom
Multiple R-squared: 0.1229, Adjusted R-squared: 0.1223
F-statistic: 209.9 on 2 and 2997 DF, p-value: < 2.2e-16
1 age.trend SE df lower.CL upper.CL
overall -0.504 0.0337 2997 -0.57 -0.438
Results are averaged over the levels of: sex
Confidence level used: 0.95
. regress y i.sex##c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(3, 2996) = 139.91
Model | 170983.675 3 56994.5583 Prob > F = 0.0000
Residual | 1220449.33 2,996 407.35959 R-squared = 0.1229
-------------+---------------------------------- Adj R-squared = 0.1220
Total | 1391433.01 2,999 463.965657 Root MSE = 20.183
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
female | 14.92308 2.789012 5.35 0.000 9.454508 20.39165
age | -.4929608 .0480944 -10.25 0.000 -.5872622 -.3986595
|
sex#c.age |
female | -.0224116 .0674167 -0.33 0.740 -.1545994 .1097762
|
_cons | 82.36936 1.812958 45.43 0.000 78.8146 85.92413
------------------------------------------------------------------------------
. margins sex, dydx(age)
Average marginal effects Number of obs = 3,000
Model VCE : OLS
Expression : Linear prediction, predict()
dy/dx w.r.t. : age
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age |
sex |
male | -.4929608 .0480944 -10.25 0.000 -.5872622 -.3986595
female | -.5153724 .0472435 -10.91 0.000 -.6080054 -.4227395
------------------------------------------------------------------------------
Call:
lm(formula = y ~ sex * age, data = m)
Residuals:
Min 1Q Median 3Q Max
-71.817 -13.848 -0.116 13.758 75.263
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 82.36936 1.81296 45.434 < 2e-16 ***
sexfemale 14.92308 2.78901 5.351 9.42e-08 ***
age -0.49296 0.04809 -10.250 < 2e-16 ***
sexfemale:age -0.02241 0.06742 -0.332 0.74
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.18 on 2996 degrees of freedom
Multiple R-squared: 0.1229, Adjusted R-squared: 0.122
F-statistic: 139.9 on 3 and 2996 DF, p-value: < 2.2e-16
sex age.trend SE df lower.CL upper.CL
male -0.493 0.0481 2996 -0.587 -0.399
female -0.515 0.0472 2996 -0.608 -0.423
Confidence level used: 0.95
. regress y i.group
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(2, 2997) = 15.48
Model | 14229.0349 2 7114.51743 Prob > F = 0.0000
Residual | 1377203.97 2,997 459.527518 R-squared = 0.0102
-------------+---------------------------------- Adj R-squared = 0.0096
Total | 1391433.01 2,999 463.965657 Root MSE = 21.437
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
group |
2 | .4084991 .8912269 0.46 0.647 -1.338979 2.155977
3 | 5.373138 1.027651 5.23 0.000 3.358165 7.38811
|
_cons | 68.35805 .6190791 110.42 0.000 67.14419 69.57191
------------------------------------------------------------------------------
. margins group, pwcompare(pv)
Pairwise comparisons of adjusted predictions Number of obs = 3,000
Model VCE : OLS
Expression : Linear prediction, predict()
-----------------------------------------------------
| Delta-method Unadjusted
| Contrast Std. Err. t P>|t|
-------------+---------------------------------------
group |
2 vs 1 | .4084991 .8912269 0.46 0.647
3 vs 1 | 5.373138 1.027651 5.23 0.000
3 vs 2 | 4.964638 1.041073 4.77 0.000
-----------------------------------------------------
Note that pairs is a generic, meaning that emmeans::pairs is an invalid call. Despite this, the “emmeans” package is required to call the below.
Call:
lm(formula = y ~ group, data = m)
Residuals:
Min 1Q Median 3Q Max
-73.531 -14.758 0.101 14.536 72.569
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 68.3580 0.6191 110.419 < 2e-16 ***
group2 0.4085 0.8912 0.458 0.647
group3 5.3731 1.0277 5.229 1.83e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 21.44 on 2997 degrees of freedom
Multiple R-squared: 0.01023, Adjusted R-squared: 0.009566
F-statistic: 15.48 on 2 and 2997 DF, p-value: 2.045e-07
contrast estimate SE df t.ratio p.value
1 - 2 -0.408 0.891 2997 -0.458 0.6467
1 - 3 -5.373 1.028 2997 -5.229 <.0001
2 - 3 -4.965 1.041 2997 -4.769 <.0001
The adjust = "none" argument skips any multiple comparisons correction. This is done to obtain the same results as the regression coefficients. If you’d prefer a more ANOVA post-hoc approach, there are other options for adjust, see ?emmeans::contrast for details.
. regress y i.group##i.sex
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(5, 2994) = 92.91
Model | 186898.199 5 37379.6398 Prob > F = 0.0000
Residual | 1204534.81 2,994 402.316235 R-squared = 0.1343
-------------+---------------------------------- Adj R-squared = 0.1329
Total | 1391433.01 2,999 463.965657 Root MSE = 20.058
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
group |
2 | 11.37444 1.573314 7.23 0.000 8.289552 14.45932
3 | 21.6188 1.588487 13.61 0.000 18.50416 24.73344
|
sex |
female | 21.62507 1.509999 14.32 0.000 18.66433 24.58581
|
group#sex |
2#female | -4.851582 1.942744 -2.50 0.013 -8.66083 -1.042333
3#female | -6.084875 3.004638 -2.03 0.043 -11.97624 -.1935115
|
_cons | 50.6107 1.367932 37.00 0.000 47.92852 53.29288
------------------------------------------------------------------------------
. margins group#sex, pwcompare(pv)
Pairwise comparisons of adjusted predictions Number of obs = 3,000
Model VCE : OLS
Expression : Linear prediction, predict()
------------------------------------------------------------------
| Delta-method Unadjusted
| Contrast Std. Err. t P>|t|
--------------------------+---------------------------------------
group#sex |
(1#female) vs (1#male) | 21.62507 1.509999 14.32 0.000
(2#male) vs (1#male) | 11.37444 1.573314 7.23 0.000
(2#female) vs (1#male) | 28.14793 1.661722 16.94 0.000
(3#male) vs (1#male) | 21.6188 1.588487 13.61 0.000
(3#female) vs (1#male) | 37.159 2.822577 13.16 0.000
(2#male) vs (1#female) | -10.25064 1.006447 -10.18 0.000
(2#female) vs (1#female) | 6.522856 1.13971 5.72 0.000
(3#male) vs (1#female) | -.0062748 1.030005 -0.01 0.995
(3#female) vs (1#female) | 15.53392 2.550404 6.09 0.000
(2#female) vs (2#male) | 16.77349 1.222358 13.72 0.000
(3#male) vs (2#male) | 10.24436 1.120772 9.14 0.000
(3#female) vs (2#male) | 25.78456 2.588393 9.96 0.000
(3#male) vs (2#female) | -6.529131 1.241826 -5.26 0.000
(3#female) vs (2#female) | 9.011069 2.643063 3.41 0.001
(3#female) vs (3#male) | 15.5402 2.597644 5.98 0.000
------------------------------------------------------------------
. margins sex@group, contrast(pv nowald)
Contrasts of adjusted predictions Number of obs = 3,000
Model VCE : OLS
Expression : Linear prediction, predict()
------------------------------------------------------------
| Delta-method
| Contrast Std. Err. t P>|t|
--------------------+---------------------------------------
sex@group |
(female vs base) 1 | 21.62507 1.509999 14.32 0.000
(female vs base) 2 | 16.77349 1.222358 13.72 0.000
(female vs base) 3 | 15.5402 2.597644 5.98 0.000
------------------------------------------------------------
To do: All pairwise comparisons
Call:
lm(formula = y ~ group * sex, data = m)
Residuals:
Min 1Q Median 3Q Max
-72.029 -13.285 0.464 13.741 67.964
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 50.611 1.368 36.998 < 2e-16 ***
group2 11.374 1.573 7.230 6.12e-13 ***
group3 21.619 1.588 13.610 < 2e-16 ***
sexfemale 21.625 1.510 14.321 < 2e-16 ***
group2:sexfemale -4.852 1.943 -2.497 0.0126 *
group3:sexfemale -6.085 3.005 -2.025 0.0429 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.06 on 2994 degrees of freedom
Multiple R-squared: 0.1343, Adjusted R-squared: 0.1329
F-statistic: 92.91 on 5 and 2994 DF, p-value: < 2.2e-16
group = 1:
contrast estimate SE df t.ratio p.value
male - female -21.6 1.51 2994 -14.321 <.0001
group = 2:
contrast estimate SE df t.ratio p.value
male - female -16.8 1.22 2994 -13.722 <.0001
group = 3:
contrast estimate SE df t.ratio p.value
male - female -15.5 2.60 2994 -5.982 <.0001
. regress y i.sex##c.age
Source | SS df MS Number of obs = 3,000
-------------+---------------------------------- F(3, 2996) = 139.91
Model | 170983.675 3 56994.5583 Prob > F = 0.0000
Residual | 1220449.33 2,996 407.35959 R-squared = 0.1229
-------------+---------------------------------- Adj R-squared = 0.1220
Total | 1391433.01 2,999 463.965657 Root MSE = 20.183
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex |
female | 14.92308 2.789012 5.35 0.000 9.454508 20.39165
age | -.4929608 .0480944 -10.25 0.000 -.5872622 -.3986595
|
sex#c.age |
female | -.0224116 .0674167 -0.33 0.740 -.1545994 .1097762
|
_cons | 82.36936 1.812958 45.43 0.000 78.8146 85.92413
------------------------------------------------------------------------------
. margins sex, at(age = (20(10)60))
Adjusted predictions Number of obs = 3,000
Model VCE : OLS
Expression : Linear prediction, predict()
1._at : age = 20
2._at : age = 30
3._at : age = 40
4._at : age = 50
5._at : age = 60
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_at#sex |
1#male | 72.51015 .9336568 77.66 0.000 70.67947 74.34082
1#female | 86.985 1.22567 70.97 0.000 84.58175 89.38824
2#male | 67.58054 .5984013 112.94 0.000 66.40722 68.75386
2#female | 81.83127 .8228617 99.45 0.000 80.21784 83.4447
3#male | 62.65093 .5541361 113.06 0.000 61.56441 63.73746
3#female | 76.67755 .5461908 140.39 0.000 75.6066 77.74849
4#male | 57.72132 .8477402 68.09 0.000 56.05911 59.38353
4#female | 71.52382 .604927 118.24 0.000 70.33771 72.70994
5#male | 52.79171 1.262091 41.83 0.000 50.31706 55.26637
5#female | 66.3701 .9380502 70.75 0.000 64.53081 68.20939
------------------------------------------------------------------------------
. marginsplot, recastci(rarea)
Variables that uniquely identify margins: age sex
Call:
lm(formula = y ~ sex * age, data = m)
Residuals:
Min 1Q Median 3Q Max
-71.817 -13.848 -0.116 13.758 75.263
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 82.36936 1.81296 45.434 < 2e-16 ***
sexfemale 14.92308 2.78901 5.351 9.42e-08 ***
age -0.49296 0.04809 -10.250 < 2e-16 ***
sexfemale:age -0.02241 0.06742 -0.332 0.74
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 20.18 on 2996 degrees of freedom
Multiple R-squared: 0.1229, Adjusted R-squared: 0.122
F-statistic: 139.9 on 3 and 2996 DF, p-value: < 2.2e-16